Data Journeys

Information:

Synopsis

Data Journeys is a podcast for aspiring Data Scientists by AJ Goldstein, where he interviews world-class Data Scientists about their learning journeys. The focus is on how they've bridged the gap between acquiring technical skills and creating real-world impact. In each episode, the goal is to equip up-and-comers with the strategies, tactics, and tools that the best in the world have used to get to where they are today.

Episodes

  • #25: Laura Noren: The Ethics of Data Science

    03/12/2018 Duration: 55min

    Laura Noren is a data science ethicist and researcher currently working in cybersecurity at Obsidian Security in Newport Beach. She holds undergraduate degrees from MIT and a PhD from NYU, where she recently completed a postdoc in the Center for Data Science. Her work has been covered in The New York Times, Canada's Globe and Mail, American Public Media's Marketplace program, and in numerous academic journals and international conferences. Dr. Norén is a champion of open source software and those who write it.   Enjoy the show!   Show Notes:   [3:55] Laura explains how she produces the Data Science Community Newsletter, covering things like how the Department of Defense just got billions in funding to do AI research. How do you incorporate humor into such rigorous coverage? [10:22] How can you distinguish signal from noise in choosing a news source? [12:13] When and how to control your biases in your work when in the heat of the moment. [14:05] Laura’s interests in data science began as an undergraduate at MIT,

  • #24: Brian McFee: Music and Data Science

    27/11/2018 Duration: 59min

    Dr. Brian McFee develops machine learning tools to analyze multimedia data. This includes recommender systems, image and audio analysis, similarity learning, cross-modal feature integration, and automatic annotation. As of Fall, 2014, he is a data science fellow at the Center for Data Science at New York University. Previously, he was a postdoctoral research scholar in the Center for Jazz Studies and LabROSA at Columbia University.   My conversation with Brian today was focused on discussing his research in music informatics and its many facets and applications. He talks about some of the methods he used during his dissertation, and I ask him for insight on how to get a recommender system to recommend stuff that you actually like.   Here are some of the highlights of the show:   [3:17] What came first for Brian, the data science or the music? [5:19] Of all the things he could have chosen to study, why did Brian choose music? [7:35] What is it like to be in a branch of data science that has become so closely

  • #23: Wes McKinney - The Creator of Pandas

    19/11/2018 Duration: 01h10s

    Wes McKinney is the creator and "Benevolent Dictator for Life" (BDFL) of the open-source pandas package for data analysis in Python, and has also authored two versions of the reference book Python for Data Analysis. Wes is also one of the co-creators of the Apache Arrow project, which is currently his main focus. Most recently, he is the founder of Ursa Labs, a not-for-profit open source development group in partnership with RStudio.   He describes himself as a problem-solver, and is particularly interested in improving the usability of data tools for programmers, accelerating data access and in-memory data processing performance, and improving data system interoperability.   In my conversation with Wes today, we focused on getting to know Wes on a more personal level, discussing his background and interests to get some insight into the living legend of open source he has become.   [3:48] How did coming from four generations of newspapermen impact Wes’s upbringing? [6:00] What kind of hobbies was he interested

  • #22: Mike Tamir: Identifying Fake News with the Head of Data Science at Uber ATG

    13/11/2018 Duration: 55min

    Mike Tamir is the Head of Data Science at Uber ATG. He is a leader in data science, specializing in deep learning and distributed scalable machine learning, and he’s also a faculty member at UC Berkeley.   Mike has led several teams of Data Scientists in the San Francisco Bay Area as Chief Data Scientist for InterTrust and Formation, Director of Data Sciences for MetaScale, and Chief Science Officer for Galvanize, where he oversaw all data science product development. He also created an MS degree program in Data Science in partnership with UNH.   Mike began his career in academia serving as a mathematics teaching fellow for Columbia University and graduate student at the University of Pittsburgh. His early research focused on developing the epsilon-anchor methodology for resolving both an inconsistency he highlighted in the dynamics of Einstein’s general relativity theory and the convergence of “large N” Monte Carlo simulations in Statistical Mechanics’ universality models of criticality phenomena.   The focu

  • #21 Frank Diana- The Future of AI - Predicting Preparing Thriving in Our Changing Future

    27/08/2018 Duration: 01h07min

    Frank Diana is a recognized futurist, thought leader and frequent keynote speaker. He has served in various executive roles throughout his career and has over 30 years of leadership experience. Currently at Tata Consultancy Services, he is focused on leadership dialog in the context of our emerging future and its implications on business, society, governments, economies, and our environment. He blends a futurist perspective with a pragmatic, actionable approach, leveraging horizon scanning and storytelling to see possible futures and drive foresight into leadership deliberation.   His leadership experience obtained through various executive roles connects practical realities with the need to focus on an emerging future filled with complexity and change. A strong ability to connect dots enables the identification of future scenarios quickly and broadly, with an ability to see implications years into the future.   The conversation with Frank centered around his research which focuses on scanning the horizon for

  • #20: Kyle Polich: Skepticism and Simplifying Complex Topics with the Host of the Data Skeptic Podcast

    20/08/2018 Duration: 01h06min

    Kyle Polich is the co-host of the incredibly popular Data Skeptic podcast, which he has been churning out since 2014. He studied computer science and focused on artificial intelligence in grad school. His general interests range from areas like statistics, machine learning, data viz, and optimization to data provenance, data governance, econometrics, and metrology.   The Data Skeptic Podcast features conversations on topics related to data science, statistics, machine learning, and artificial intelligence. The podcast breaks down into two different episode formats. One is a short form podcast where Kyle explains complex data science concepts in a way that non-data scientists can understand. In these episodes he’s joined by his co-host and wife, Linh da Tran. The second format is a long form interview format where Kyle interviews experts in various data science and skepticism related arenas about their work.   In this episode of the Data Journeys Podcast, I pick Kyle’s brain for patterns noticed and lessons le

  • #19: Emily Glassberg Sands: Equalizing Access to Rewarding Careers as Head of Data Science at Coursera

    13/08/2018 Duration: 01h14min

    Emily Glassberg Sands is the Head of Data Science at Coursera - the largest online learning platform for higher education with 35M learners from around the world. Her team leads the quantitative measurement, experimentation, and inference that informs Coursera’s product and business direction.   Emily received her Bachelor's degree in Economics from Princeton University, and then moved on to complete her Ph.D. in Economics from Harvard University. At Harvard, her research focused on experimental and applied methods to better understand labor markets and consumer decision-making.   An economist by training, Emily loves using data to build better, smarter products that have a positive impact on society. In this interview, we discuss the insights Emily has found in her work at Coursera and how they can be applied to give everyone in the world equal access to education. We covered topics like:   Emily’s journey to a career in science, and how she went from Montessori school, to the farms of Montana, t

  • #18: Dan Hammer: Democratizing Environmental Data at the White House, NASA, National Geographic, and More

    06/08/2018 Duration: 01h01min

    Dan Hammer is an environmental economist and winner of the 2017 Pritzker Prize for the Environment. Currently he serves as a National Geographic Fellow and the co-founder of Earthrise Media, and throughout 2016, he was the Senior Policy Advisor to the U.S. Chief Technology Officer, Megan Smith, as part of the Obama Administration.   Before arriving at the White House, Dan was the Presidential Innovation Fellow that released the first API listing for NASA. Prior to NASA, Hammer was the Chief Data Scientist at the World Resources Institute, where he helped re-launch Global Forest Watch, an open-source project to monitor deforestation.   After graduating from Swarthmore College in 2007 with high honors in mathematics and economics, and before receiving his PhD in environmental economics from the University of California, Berkeley, Dan was a Thomas J. Watson Fellow and traveled to Polynesia to build and race outrigger canoes. Today, among many other amazing mentors, he continues to work with Steve McCormick

  • #17: Decoding Healthy Meals & Learning the World of Food with Sivan Aldor-Noiman and Erik Andrejko @ Wellio

    30/07/2018 Duration: 54min

    In this episode of the Data Journeys podcast, we have not one, but two guests!   Sivan Aldor-Noiman and Erik Andrejko join me from Wellio: an intelligent platform that uses machine learning and behavioral science to help people plan, shop, prepare and enjoy healthy meals at home.   Wellio is on a mission to decode how meals are prepared and enjoyed at home, both on an individual-level in terms of people’s preferences and on a global-level in terms of semantic & nutritional understanding of food.   Quite interestingly, Sivan began her career in the Israeli army, serving as an instructor for an anti-tank missile unit. Afterwards, she transitioned to school and received her undergraduate degree in Industrial Engineering and a Master in Statistics from the Technion Israel Institute of Technology. At 26 years old she moved to the U.S. to complete a PhD degree in Statistics from The Wharton Business School at the University of Pennsylvania. Since then, she’s spent the majority of her career at The Climate Corp

  • #16: Jim Guszcza: Chief Data Scientist at Deloitte Consulting

    23/07/2018 Duration: 57min

    Jim Guszcza is the US Chief Data Scientist at Deloitte Consulting. Deloitte is the largest professional services network in the world in terms of revenue, with over 263,900 professionals globally.   Jim has been with Deloitte since 2001, where he took the lead on using behavioural nudge tactics to effectively act on model indications and prompt behavior change. Since then, he has gained extensive experience in applying predictive analytics to a variety of public and private sector domains.   In addition to his work with Deloitte, Jim is also an author and former professor at the University of Wisconsin-Madison business school. He received his PhD in Philosophy from the University of Chicago.   In this interview, we explore how philosophy and nudge theory can be applied to change human behavior using data science. Some highlights include:   How growing up a sci-fi fan led Jim to pursue science and philosophy as a career. The importance of a philosophy education and how it can train you to analyze issues you e

  • #15: Travis Oliphant: Creating, Evolving, & Funding Open-Source Software

    16/07/2018 Duration: 01h05min

      Travis Oliphant is the Founder & CEO of Quansight: a company that bridges open-source communities and innovative companies by growing talent, building technology, and discovering new products. For years Travis has been an indispensable contributor toward data science’s open-source movement through so many different outlets: Founder, Director, & Former CEO @ Anaconda, Inc: a free and open source distribution of over 250 popular data science packages for Python and R, used by over 6 million users. Founder, Chairman of the Board @ NumFOCUS Foundation: the world-renowned open-source community promoting open code development and reproducible scientific research. President @ Enthought: a software company best known for the early development and maintenance of the SciPy stack. Creator of NumPy, SciPy, Numba, & XND: all invaluable open-source Python libraries Before founding Continuum Analytics (later renamed to Anaconda) in 2012, Travis received a Ph.D. from the Mayo Clinic, B.S. and M.S. degrees in

  • #14: Drew Conway – Applying Data Science to Where It’s Needed Most

    09/07/2018 Duration: 01h11min

    Drew Conway is a world-renowned data scientist, entrepreneur, author, and speaker, perhaps most well-known for his infamous 2010 “Data Science Venn Diagram”. Today, Drew is the Founder & CEO of Alluvium: a company using machine learning and AI to turn massive streams of data produced by industrial operations companies into insights that bridge the gap between big data and human expertise. Designed with the goal of helping industrial operations become safer, more efficient and more profitable, the Alluvium platform makes industrial machine data meaningful and useful to the people who rely on it to make decisions that affect the stability of their operation. Before starting Alluvium in 2015, Drew helped start: Data Gotham: an organization focused on supporting the NYC data community, with an annual conference bringing together people from all industries DataKind: a non-profit that brings high-impact organizations together with leading data scientists to use data science in the service of humanity. They en

  • #13: Anthony Scriffignano: Developing Cultural Awareness & Global Perspective Through International Data Science

    02/07/2018 Duration: 51min

    Anthony Scriffignano is the Chief Data Scientist at Dun & Bradstreet: a publicly traded company that provides commercial data, analytics and insights about businesses through their database of more than 290 million business records worldwide. Most commonly known as D&B, the company was originally founded 176 years ago in the horse-and-buggy days of 1841, and today, is headquartered in Short Hills, NJ, with over $2.5 billion in assets. Anthony has been at D&B since 2002 and, with over 35 years of experience in information technologies, Big-4 management consulting, and international business, is well-regarded as an international thought-leader in data science. Today he leads a team of data scientists focused on advancing Dun & Bradstreet's core capabilities and IP globally. With extensive background in advanced algorithms and linguistics, he holds multiple patents and presents globally on data and technology trends, multilingual challenges in business identity, and artificial intelligence. In t

  • #12: Favio Vázquez - Data Science Polymath

    25/06/2018 Duration: 01h29min

    Favio Vazquez is the Principal Data Scientist at OXXO: Mexico’s largest convenience store chain with over 17,500 locations. In addition to his work at OXXO, Favio brings new meaning to the term “polymath” by simultaneously holding 5 other related positions at AI companies in Mexico: Senior Data Scientist @ Raken Data Group Chief Data Scientist @ Iron AI Creator of the Ciencias y Datos online course Data Science Lecturer @ Afi Escuela de Finanzas Collaborator @ CosmoSIS With such a wide-ranging background, we covered a lot in this conversation, including: Favio’s experience growing up in Venezuela, the childhood influencers that played a core part in shaping him into who he is today Tricks like “frontloading” and “batching” that Favio uses to juggle many projects at once How individuals looking to get hired as a company’s first data scientist should explain the value that data science can provide, what language to use and terminology to avoid His approach for determining which questions would yield the most

  • #11: Dan Wulin: International E-Commerce, Price Optimization, & Home-Good Product Recommendations

    18/06/2018 Duration: 01h05min

    Dan Wulin is the Director of Data Science at Wayfair: an international e-commerce company specializing in home goods. Wayfair is a $5B company growing 40% year-over-year, with 10 million products and over 8,700 employees around the world. Their data science team is 80 people strong and growing fast, using econometrics to optimize prices, biostatistics to boost marketing, and computer vision to personalize product recommendations. Prior to joining Wayfair, Dan studied Math & Physics as an undergraduate at Columbia University and, thereafter, received his PhD in Theoretical Physics from the University of Chicago. Coming out of school, he worked as a consultant at Boston Consulting Group for a year in Chicago before transitioning to Wayfair in Boston. In this conversation, we cover a wide range of topics, including: His childhood obsession with text-based multiplayer RPGs like Gemstone Dan’s roots in academia, how studying the physics of superconductors taught him (painfully) how to break down complex probl

  • #10: Jure Leskovec - Chief Scientist of Pinterest

    11/06/2018 Duration: 01h31s

    Jure Leskovec is the Chief Scientist of Pinterest, an $11 billion company hosting over 75 billion idea “pins” from its 175 million monthly users worldwide. Jure originally arrived at Pinterest in 2014 when his company, Kosei, was acquired after starting a “recommendation revolution” through smarter, personalized mobile ads.  When Jure is not “turning cameras into keyboards” at Pinterest -- Fast Company’s “2nd most innovative AI company” -- he can also be found fulfilling his responsibilities as an: Associate Professor of Computer Science at Stanford University - where his research focuses on mining and modeling large social and information networks, including relationship graphs and chain effects in online community settings Investigator at the Chan-Zuckerberg Biohub - a multidisciplinary research organization on a mission to make all diseases preventable, manageable or curable by the year 2100 Some favorite topics we covered include: How being Pinterest’s Chief Data Scientist has affected his own s

  • #9: Daniela Huppenkothen – Astronomy, Cosmology, & The Study of Space

    04/06/2018 Duration: 01h23min

    Daniela Huppenkothen is the Associate Director at the DIRAC Institute of the University of Washington, home to a diverse team of researchers in astrophysics and cosmology.  Before arriving at DIRAC, she studied space & data science as a: Moore-Sloan Data Science Fellow of NYU - a multi-dimensional and independent research program with impact in several scientific domains PhD recipient at University of Amsterdam - where she researched high & low energy astrophysics and the relationship between the two communities, namely “magnetar bursts”. Blogger of Hackathons, Data Science, and Academia - Daniela takes on a number of issues and shares her wise life lessons along the way.   In this conversation, we cover a wide range of topics, including: The all-too-common (but remarkably important) mix up between astrology and astronomy How data science has a place in astronomy, with telescopes more powerful than several hundred iPhones Frustrations with academia, a lack of validity when it comes to standardize

  • #8: Chris Albon – Connecting Africa to the Internet & Defending Public Discourse

    14/05/2018 Duration: 01h27min

    Chris Albon is the Chief Data Scientist of BRCK: a Kenyan startup building a network dedicated to connecting Africa to the internet, and author of the Machine Learning with Python Cookbook. For years he’s been contributing to the data science world through so many different outlets:   Cofounder of New Knowledge AI - a social media platform focused on protecting companies from disinformation, fighting fake news, and defending public discourse Former host of Partially Derivative - a popular podcast mixing explorations into data science techniques with discussions with the field’s leading experts. Content creator of Machine Learning Flashcards  - simplified, easy to digest flashcards for otherwise-complex machine learning concepts.   Blog writer at chrisalbon.com - providing some of the best (and definitely most wide-ranging) technical notes out there on machine learning, statistics, deep learning, Python, and so much more. Our conversation went many places, including: How early childhood experiences (inclu

  • #7: Fernando Perez — Creating IPython, Founding NumFOCUS, & The Stories Behind It All

    07/05/2018 Duration: 52min

    Fernando Perez is best-known as the creator of IPython and co-founder of Project Jupyter: a set of open-source data science tools that some may consider to be the equivalent of the bat & ball to the sport of baseball. Today, you really can’t play the game of data science without Jupyter Notebooks and our guest today is one of Jupyter's leads and originators (see here for the rest of the amazing team). Fernando is also an Assistant Professor in Statistics at UC Berkeley, Researcher at the Berkeley Institute for Data Science, and Founding Board Member of the NumFOCUS foundation — the community that creates the SciPy stack, along with virtually every other notable open source data science tool out there. This conversation was recorded in-person with Fernando in his office on UC Berkeley’s campus, and it turned out to be the most humanizing, energizing, and down-to-earth interview I’ve had so far. Some of the many topics we covered include: what Fernando wanted to be while growing up in Medellin (Me-de-jean)

  • #6: Andrew Ng — Globalizing Education, Disrupting Industries, & Generalizing Artificial Intelligence

    11/04/2018 Duration: 44min

    Andrew Ng is an Adjunct Professor at Stanford University and nothing short of a giant in the data science, machine learning, and artificial intelligence world. For the past decade he’s been shaping the way we live and learn. Four recent examples include: Co-Founder of Coursera: an education platform that offers online courses from top universities across the world Chairman of the Board for Woebot: a chatbot that’s currently revolutionizing mental health care Creator of DeepLearning.AI: a series of specialization courses created to help beginners break into the field of AI Former Chief Scientist at Baidu AI Group: basically the Google of China This conversation was recorded in-person with Andrew in his office on Stanford’s campus in Palo Alto, California. We covered a ton of different topics, including: the goals of Andrew's new $175M AI Fund how he plans to revolutionize manufacturing strategies for ML practitioners to tighten their feedback loop what differentiates the best businesses like Amazon, Faceboo
